Load the necessary libraries
library(rmarkdown) #to render rmarkdown documents
library(tidyverse) #for data wrangling and plotting
Reproducible research is a data analysis concept that promotes publishing all analysis source code, outcomes and supporting commentary (such as a description of methodologies and interpretation of results) in a way that maximises the reproducibility of the findings for verification or mimicry.
Reproducible research is about preserving as much of the analysis pathway as possible to maximise the likelihood that the analyses can be replicated by others or even yourself in future. This preservation involves bundling up associated sources of data, code and commentary, and this can be achieved via a number of means (the following list is an overview - the items will be expanded on later):
pandoc
Both LaTeX and HTML are markup languages. They both have standardized short-hand syntax to specify how content should be styled and formatted. Markdown is another markup language with its own specific syntax, yet it is far simpler and less verbose than either LaTeX or HTML. The goal of markup languages is to provide simple styling rules and syntax so as to allow the author to concentrate on the content. To this end, the highly simplified syntax of the markdown language makes it one of the most concise and content-focused formats. Unlike many other markup languages (such as LaTeX and HTML), carriage returns and spaces form an important part of the language structure and thus influence the formatting of the final document.
To gain an appreciation of some of the simple styling rules of a markdown document, consider the following:
---
title: Example markdown
author: D. Author
date: 16-06-2020
---
This is the title
=====================
## Section 1
A paragraph of text containing a word that is **emphasised** or ~~strikethrough~~.
Followed by an unordered list:
- item 1
- item 2
Or perhaps an enumerated list:
1. item 1
2. item 2
### Subsection 1.1
There might be a [link](https://www.markdownguide.org/) or even a table:
+-----------+---------+-----------------------+
| Item      | Example | Description           |
+===========+=========+=======================+
| numeric   | 12.34   | floating point number |
+-----------+---------+-----------------------+
| character | 'Site'  | words                 |
+-----------+---------+-----------------------+
| ...       |         |                       |
+-----------+---------+-----------------------+
Even in plain text, the general formatting is obvious. This simplicity also makes markdown an ideal language for acting as a base source from which other formats (such as PDF, HTML, Presentations, Ebooks) can be created as well as a sort of conduit language through which other formats are converted.
Pandoc is a universal document converter that converts between one markup language and another. Specifically, Pandoc can read markdown and subsets of the following formats:
Pandoc can write the following formats:
By way of example, the above markdown can be rendered into multiple popular formats via pandoc.
pandoc -o example1.pdf example1.md
pandoc -s -o example1.html example1.md
pandoc -o example1.docx example1.md
Many of the above markup languages feature extensive definitions for styling and formatting rules that do not have direct equivalents within other languages. For example, Cascading Style Sheets and Javascript within HTML provide advanced styling and dynamic presentation of content that cannot be easily translated into other languages. Similarly, there are many macros available for LaTeX that enhance the styling and formatting of content relevant to PDF. Consequently, not all of the more advanced features of each of the languages are supported by Pandoc for conversion.
Pandoc fully supports markdown as an input language, making markdown a popular base language to create content from which other formats can be generated. For example, contents authored in markdown can then be converted into PDF, HTML, HTML presentations, eBooks and others. There are currently numerous dialects of the markdown language. Pandoc has its own enhanced dialect of markdown which includes syntax for bibliographies and citations, footnotes, code blocks, tables, enhanced lists, tables of contents, embedded LaTeX math.
This tutorial will focus on markdown as a base source language from which PDF, HTML, presentations and eBooks are created. As a result, the tutorial will focus on Pandoc’s enhanced markdown. That said, from now on, we will not use pandoc directly - rather we will employ specific R functions that engage with pandoc as part of their overall processing.
Rather than introduce the structural elements of markdown and the intricacies of the pandoc tool in abstract terms, the main features will be described and demonstrated in an R context with Rmarkdown.
You may have noticed in the example above that at the top of the markdown there was a block of lines starting with three hyphens (---) and ending with three hyphens (---). When processed via pandoc, these lines define the document’s metadata (such as the title, author and creation date).
The meta data are a set of key value pairs in YAML format. The list of useful metadata depends on the intended output.
The following rules can be applied to yield different outcomes:
The three fields must be in order of title, author(s), date with each on a separate line
When omitting a field, the field must be left as a line just containing the % character
Multiple authors can be defined by either:
separating the names with a ; (semicolon) character
---
title: This is the title
author:
- name D. Author
- name D. Other
date: 14-02-2013
---
In addition to the above metadata fields, the YAML header provides a mechanism for storing processing preferences. For example, output-dependent options can be specified by indenting each of the options under the output format (the following example indicates that HTML documents should have a table of contents).
---
title: This is the title
author: D. Author
date: 14-02-2013
output:
  html_document:
    toc: yes
---
Note, YAML formatting is very particular. Indentation must be via spaces (not tabs).
Since most of the metadata fields are specific to output behaviours, we will illustrate other fields when describing the associated outputs.
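Because tabs are easy to introduce by accident, it can help to check the header before converting. The following is a minimal shell sketch of such a check, not part of pandoc itself: the file name yaml_example.md is hypothetical, and the check simply looks for tab characters between the opening and closing --- lines.

```shell
# Write a small markdown file with a YAML header (hypothetical file name).
cat > yaml_example.md <<'EOF'
---
title: This is the title
author: D. Author
output:
  html_document:
    toc: yes
---

# Section 1
EOF

# Collect any header line (between the first pair of --- lines) that contains a tab.
tab_lines=$(awk -v tab="$(printf '\t')" '
  /^---[[:space:]]*$/ { fence++; next }
  fence == 1 && index($0, tab) { print FILENAME ":" FNR ": " $0 }
' yaml_example.md)

if [ -n "$tab_lines" ]; then
  echo "tabs found in YAML header:"
  echo "$tab_lines"
else
  echo "YAML header uses spaces only"
fi
```

Since the header above is indented with spaces, the check reports that no tabs were found.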
Brief changes to font styles within a block of text can be effective at emphasizing or applying different meanings to characters. Common text modifier styles are: italic, bold and strikethrough.
| Markdown | Result |
|---|---|
| *Italic text* or _Italic text_ | Italic text |
| **Bold text** or __Bold text__ | Bold text |
| ~~Strikethrough~~ | Strikethrough |
| `Monospace font` | Monospaced font |
| superscript^2^ | superscript² |
| subscript~2~ | subscript₂ |
If the content to be raised or lowered (for super- and subscripts) contains spaces, then they must be escaped by preceding each space with a \ character. For example, Effect~Oxygen\ concentration~ renders as Effect followed by the subscript "Oxygen concentration".
Note, underlined text is not defined in any dialect of markdown (including pandoc markdown) as the developers believe that the underline style is a relic of the days of typewriters when there were few alternatives for emphasizing words. Furthermore, underlining of regular words within a sentence tends to break the aesthetic spacing of lines.
Horizontal lines are indicated by a row of three or more *, - or _ characters (optionally separated by spaces) with a blank row either side.
---
The rate of oxygen consumption (O~2~ per min^-1^.mg^2^) ...
Effect~Oxygen\ concentration~
Markdown (*.md)
---
title: Example markdown
author: D. Author
date: 16-06-2020
---
This is the title
=====================
A paragraph of text containing a word that is **emphasised**, ~~strikethrough~~
and `Monospace`.
___
The rate of oxygen consumption (O~2~ per min^-1^.mg^2^)
Effect~Oxygen\ concentration~
***
pandoc -o example2.pdf example2.md
pandoc -s -o example2.html example2.md
pandoc -o example2.docx example2.md
Pandoc markdown supports two heading formats (pandoc markdown headings must be preceded by a blank line):
Setext-style headings. Level 1 headings are specified by underlining the heading with a row of = characters and level 2 headings are specified by underlining with a row of - characters.
Setext-style headings only support level 1 and level 2 headings.
Section 1
===========
Subsection
------------
### Subsubsection
# Section 2
## Subsection
### Subsection
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
Section 1
============
Subsection
-----------
Section 2
===========
pandoc -o example3a.pdf example3a.md
pandoc -s -o example3a.html example3a.md
pandoc -o example3a.docx example3a.md
ATX-style headings. Headings are specified by one to six # characters followed by the heading text.
Section 1
===========
Subsection
------------
### Subsubsection
# Section 2
## Subsection
### Subsection
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
## Subsection
### Subsubsection
# Section 2
pandoc -o example3b.pdf example3b.md
pandoc -s -o example3b.html example3b.md
pandoc -o example3b.docx example3b.md
A table of contents can be included by issuing the --toc command line switch to pandoc. For some output formats (such as HTML), a block of links to section headings is created, whilst for others (such as LaTeX), an instruction (\tableofcontents) for the external driver to create the table of contents is generated.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
## Subsection
### Subsubsection
# Section 2
pandoc --toc -o example3b.pdf example3b.md
pandoc -s --toc -o example3b.html example3b.md
pandoc --toc -o example4.docx example3b.md
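The depth of the table of contents can also be controlled. The following sketch assumes pandoc is on the PATH (the conversion is skipped otherwise) and uses a hypothetical file name, toc_example.md; it combines --toc with pandoc's --toc-depth option so that only level 1 and level 2 headings are listed.

```shell
# Write a small document with three heading levels (hypothetical file name).
cat > toc_example.md <<'EOF'
---
title: This is the title
---

# Section 1

## Subsection

### Subsubsection

# Section 2
EOF

if command -v pandoc >/dev/null 2>&1; then
  # --toc-depth=2 limits the table of contents to level 1 and 2 headings.
  pandoc -s --toc --toc-depth=2 -o toc_example.html toc_example.md
else
  echo "pandoc not found; skipping conversion"
fi
```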
Normal text
> This is a block quotation. Block quotations are specified by
> preceding each line with a > character. The quotation block
> will be indented.
>
> To have paragraphs in block quotations, separate paragraphs
> with a line containing only the block quotation mark character.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
> This is a block quotation. Block quotations are specified by
> preceding each line with a > character. The quotation block
> will be indented.
>
> To have paragraphs in block quotations, separate paragraphs
> with a line containing only the block quotation mark character.
Block quotations in pandoc markdown follow email conventions - that is, each line is preceded by a > character.
pandoc -o example5.pdf example5.md
pandoc -s -o example5.html example5.md
pandoc --toc -o example5.docx example5.md
Verbatim blocks are typically used to represent blocks of code syntax. The text within the verbatim block is rendered literally as it is typed (retaining all spaces and line breaks) and in a monospaced font (typically Courier). In pandoc markdown, verbatim text blocks are specified by indenting a block of text by either four spaces or a tab character. Within verbatim text, regular pandoc markdown formatting rules (due to spaces etc) are ignored.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
    a = rnorm(10,5,2)
    for (i in 1:10) {
        print(a[i])
    }
pandoc -o example6.pdf example6.md
pandoc -s -o example6.html example6.md
pandoc --toc -o example6.docx example6.md
Alternatively, verbatim blocks can be specified without indentation if the text block is surrounded by a row of three or more ~ characters. This format is often referred to as fenced code.
Normal text
~~~~
a = rnorm(10,5,2)
for (i in 1:10) {
    print(a[i])
}
~~~~
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
~~~
a = rnorm(10,5,2)
for (i in 1:10) {
    print(a[i])
}
~~~
pandoc -o example7.pdf example7.md
pandoc -s -o example7.html example7.md
pandoc --toc -o example7.docx example7.md
There are three basic list environments available within pandoc markdown:
A bullet list item begins with either a *, + or - character followed by a single space. Bullets can also be indented.
Bullet list
* This is the first bullet item
* This is the second.  
    To indent this sentence on the next line,
    the previous line ended in two spaces and
    this sentence is indented by four spaces.
* This is the third item
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
* This is the first bullet item
* This is the second.  
    To indent this sentence on the next line,
    the previous line ended in two spaces and
    this sentence is indented by four spaces.
* This is the third item
pandoc -o example8.pdf example8.md
pandoc -s -o example8.html example8.md
pandoc --toc -o example8.docx example8.md
An ordered list item begins with a number followed by a space. The list enumerator can be a decimal number or a roman numeral. In addition to the enumerator, other formatting characters can be used to further define the format of the list numbering.
Ordered list
1. This is the first numbered item.
2. This is the second.
1. This is the third item. Note that the number I supplied is ignored
(i) This is list with roman numeral enumerators
(ii) Another item
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
1. This is the first numbered item.
2. This is the second.
1. This is the third item. Note that the number I supplied is ignored
# Section 2
(i) This is list with roman numeral enumerators
(ii) Another item
pandoc -o example9.pdf example9.md
pandoc -s -o example9.html example9.md
pandoc --toc -o example9.docx example9.md
Note that only the value of the number used for the first item is considered. For subsequent list items, the values of the numbers themselves are ignored; they are merely used to confirm that the list items have the same sort of enumerator.
Definition list
Term 1
: This is the definition of this term
This is a phrase
: This is the definition of the phrase
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Term 1
: This is the definition of this term
This is a phrase
: This is the definition of the phrase
pandoc -o example10.pdf example10.md
pandoc -s -o example10.html example10.md
pandoc --toc -o example10.docx example10.md
To include multiple paragraphs (or other blocked content) within a list item or nested lists, the content must be indented by four or more spaces from the main list item.
Nested lists
1. This is the first numbered item.
2. This is the second.
    i) this is a sub-point
    ii) and another sub-point
1. This is the third item. Note that the number I supplied is ignored
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
1. This is the first numbered item.
2. This is the second.
    i) this is a sub-point
    ii) and another sub-point
1. This is the third item. Note that the number I supplied is ignored
pandoc -o example11.pdf example11.md
pandoc -s -o example11.html example11.md
pandoc --toc -o example11.docx example11.md
Normally, pandoc considers a list as complete when a blank line is followed by non-indented text (as markdown does not have starting and ending tags). However, if you wish to place indented text directly after a list, it is necessary to provide an explicit indication that the list is complete. This is done with the <!-- end of list --> marker.
Similarly, if you wish to place one list directly following on from another list, a <!-- --> marker must be used between the two lists so as to explicitly separate them.
1. This is the first numbered item.
2. This is the second.
1. This is the third item. Note that the number I supplied is ignored
<!-- -->
1. Another list.
2. With more points
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
1. This is the first numbered item.
2. This is the second.
1. This is the third item. Note that the number I supplied is ignored
<!-- -->
1. Another list.
2. With more points
pandoc -o example12.pdf example12.md
pandoc -s -o example12.html example12.md
Not sure why this does not work for word…
pandoc -o example12.docx example12.md
As markdown is a very minimalist markup language that aims to be reasonably well formatted even when read as plain text, table formatting must be defined by layout features that have meaning in plain text.
Table captions can be provided by including a paragraph that begins with either Table: or just :. Everything prior to the : will be stripped off during processing.
The number of columns as well as column alignment are determined by the relative positions of the table headings and dashed row underneath:
The table must finish in either a blank line or a row of dashes mirroring those below the header followed by a blank row.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Table: A description of the table
Column A   Column B     Column C
---------  ----------  ---------
Category 1 High           100.00
Category 2 High            80.50
---------  ----------  ---------
pandoc -o example13a.pdf example13a.md
pandoc -s -o example13a.html example13a.md
Note that simple tables do not render well in LibreOffice. The DOCX thumbnail presented below is generated by converting the DOCX to a PNG image using unoconv. As this is a command line tool that is part of the LibreOffice family, the resulting thumbnail will not render the table correctly. The actual DOCX will nevertheless render fine within either Microsoft Word or WPS Office.
pandoc --toc -o example13a.docx example13a.md
Simple tables can be extended to allow cell contents to span multiple lines. This imposes the following additional layout requirements:
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Table: A description of the table
--------------------------------
Column A   Column B    Column
                       C
---------  ----------  ---------
Category 1 High           100.00
           High            95.00

Category 2 High            80.50
           High            82.50
--------------------------------
pandoc -o example13b.pdf example13b.md
pandoc -s -o example13b.html example13b.md
pandoc --toc -o example13b.docx example13b.md
Grid tables have a little more adornment in that they use characters to mark all the cell boundaries. However, by explicitly defining the bounds of a cell, grid tables permit more complex cell contents. A grid table for example, can contain a list or a code block etc.
Cell corners are marked by + characters and the table header and main body are separated by a row of = characters.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Table: A description of the table
+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+===============+===============+====================+
| Bananas       | $1.34         | - built-in wrapper |
|               |               | - bright color     |
+---------------+---------------+--------------------+
| Oranges       | $2.10         | - cures scurvy     |
|               |               | - tasty            |
+---------------+---------------+--------------------+
Table: Another table
+-----------+----------+-----------+
|Column A   |Column B  |   Column C|
+===========+==========+===========+
|Category 1 |100.00    | - point A |
|           |          | - point B |
+-----------+----------+-----------+
|Category 2 | 85.00    | - point C |
|           |          | - point D |
+-----------+----------+-----------+
pandoc -o example13c.pdf example13c.md
pandoc -s -o example13c.html example13c.md
pandoc --toc -o example13c.docx example13c.md
Although grid tables require substantially more setup, emacs users will welcome that grid tables are compatible with emacs table mode.
Finally, there are also pipe tables. These are somewhat similar to grid tables in requiring a little more explicit specification of cell boundaries. However, unlike grid tables, they have a means to configure column alignment: cell alignment is specified via the use of : characters in the row beneath the header (see example below). Nor is it necessary to indicate cell corners.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Table: A description of the table
| Default | left | Center | Right |
|---------|:------|:------:|-------:|
| High | Cat 1 | A | 100.00 |
| High | Cat 2 | B | 85.50 |
| Low | Cat 3 | C | 80.00 |
pandoc -o example13d.pdf example13d.md
pandoc -s -o example13d.html example13d.md
pandoc --toc -o example13d.docx example13d.md
Note that pipe tables do not render well in LibreOffice. The DOCX thumbnail presented below is generated by converting the DOCX to a PNG image using unoconv. As this is a command line tool that is part of the LibreOffice family, the resulting thumbnail will not render the table correctly. The actual DOCX will nevertheless render fine within either Microsoft Word or WPS Office.
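Because pipe tables do not depend on column alignment in the source, they are easy to generate from delimited data. The following is a small sketch using awk; the file names scores.csv and scores_table.md and the data themselves are hypothetical, and the script assumes simple comma-separated values without quoted fields. It right-aligns the last column and left-aligns the others.

```shell
# Hypothetical input data; the first row is the header.
cat > scores.csv <<'EOF'
Level,Category,Score
High,Cat 1,100.00
High,Cat 2,85.50
Low,Cat 3,80.00
EOF

# Emit a pipe table: header row, alignment row, then one row per record.
# The last column is right-aligned (---:), the others left-aligned (:---).
awk -F',' '
  { line = "|"; for (i = 1; i <= NF; i++) line = line " " $i " |"; print line }
  NR == 1 {
    line = "|"
    for (i = 1; i <= NF; i++) line = line (i == NF ? "---:|" : ":---|")
    print line
  }
' scores.csv > scores_table.md

cat scores_table.md
```

The second line of the generated table is |:---|:---|---:|, which is where pandoc reads the column alignment from.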
Images are not displayed in plain text (obviously). However, an image link in pandoc markdown will insert the image into the various derivative document types (if appropriate). Image links are defined in a similar manner to other links, yet preceded immediately by a ! character.
![label](filename)

#OR
![label]
[label]: filename

Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
Include the JPEG figure
{width=50%}
And a PNG figure
{width=60%}
pandoc -o example14.pdf example14.md
pandoc -s -o example14.html example14.md
pandoc -o example14.docx example14.md
Markdown leverages TeX math processing. Whilst this does technically break the rules that promote source documents that are readable in text only mode, the payoff is that math is rendered nicely in the various derivative documents (such as PDF or HTML). In fact, math is passed straight through to the derivative document, allowing that document (or its reader) to handle TeX math as appropriate.
Inline math is defined as anything within a pair of $ characters and for math in its own environment (paragraph), use a pair of $$ characters.
The formula, $y=mx+c$, is displayed inline.
Some symbols and equations (such as
$\sum{x}$ or $\frac{1}{2}$) are rescaled
to prevent disruptions to the regular
line spacing.
For more voluminous equations (such as
$\sum{\frac{(\mu - \bar{x})^2}{n-1}}$),
some line spacing disruptions are unavoidable.
Math should then be displayed in display mode.
$$\sum{\frac{(\mu - \bar{x})^2}{n-1}}$$
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Section 1
The formula, $y=mx+c$, is displayed inline.
Some symbols and equations (such as
$\sum{x}$ or $\frac{1}{2}$) are rescaled
to prevent disruptions to the regular
line spacing.
For more voluminous equations (such as
$\sum{\frac{(\mu - \bar{x})^2}{n-1}}$),
some line spacing disruptions are unavoidable.
Math should then be displayed in display mode.
$$\sum{\frac{(\mu - \bar{x})^2}{n-1}}$$
pandoc -o example15.pdf example15.md
pandoc --mathjax -s -o example15.html example15.md
pandoc -o example15.docx example15.md
Note that not all math is rendered correctly in LibreOffice. The DOCX thumbnail presented below is generated by converting the DOCX to a PNG image using unoconv. As this is a command line tool that is part of the LibreOffice family, the resulting thumbnail will not render some of the equations correctly. The actual DOCX will nevertheless render fine within either Microsoft Word or WPS Office.
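For HTML output, how the math is handled depends on the option passed to pandoc. As a rough sketch (assuming pandoc is on the PATH, and using the hypothetical file name math_example.md), the same source can be converted with --mathjax, which leaves the TeX for a browser script to typeset, or with --mathml, which asks pandoc to convert the TeX to MathML itself.

```shell
# Write a small document containing inline and display math (hypothetical file name).
cat > math_example.md <<'EOF'
---
title: Math example
---

The formula, $y=mx+c$, is displayed inline.

$$\sum{\frac{(\mu - \bar{x})^2}{n-1}}$$
EOF

if command -v pandoc >/dev/null 2>&1; then
  # MathJax: the TeX is passed through and typeset in the browser by a script.
  pandoc --mathjax -s -o math_mathjax.html math_example.md
  # MathML: pandoc converts the TeX to MathML markup itself.
  pandoc --mathml -s -o math_mathml.html math_example.md
else
  echo "pandoc not found; skipping conversion"
fi
```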
Links connect information and content between different parts of a document, or even between documents, by providing clickable references to internal or external content.
Internal links make use of the section identifiers that pandoc generates automatically from section headings. To reference (link to) a section, use the identifier of the target section as the reference label in the following format:
[in text label](#reference-label)
The in text label is the word or phrase that should appear as the link in the text, and the reference label is the automatically generated identifier of the section you wish to link to: the heading text converted to lowercase, with spaces replaced by hyphens and most punctuation removed (so a heading "Reference label" has the identifier reference-label). Note, there should not be any spaces between the square brackets and the parentheses.
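Pandoc derives these automatic section identifiers from the heading text by lowercasing it, replacing spaces with hyphens and dropping most punctuation. The helper below, heading_id, is a hypothetical shell approximation of that rule for plain ASCII headings; it is a sketch for predicting link targets, not pandoc's own implementation, and it does not handle duplicate headings or headings that reduce to an empty string.

```shell
# Approximate pandoc's automatic heading identifier for an ASCII heading:
# lowercase, drop characters other than letters, digits, _ . - and spaces,
# turn spaces into hyphens, then strip anything before the first letter.
heading_id() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |
    sed -e 's/[^a-z0-9_. -]//g' -e 's/ /-/g' -e 's/^[^a-z]*//'
}

heading_id "Methods"          # prints: methods
heading_id "Reference label"  # prints: reference-label
```

A link to a section headed "Reference label" would then be written as [in text label](#reference-label).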
Arbitrary links to other internal items such as figures and tables can also be defined
To illustrate, we will define links to a section heading (#sec), table (tbl) and figure (fig).
# Introduction {#sec:Intro}
# Section 2
See the [introduction](#sec:Intro).
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Introduction
A simple reference to the [methods section](#methods).
# Methods
A link to [the table][tbl]
Table: A description of the table
+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+===============+===============+====================+
| Bananas       | $1.34         | - built-in wrapper |
|               |               | - bright color     |
+---------------+---------------+--------------------+
| Oranges       | $2.10         | - cures scurvy     |
|               |               | - tasty            |
+---------------+---------------+--------------------+
[tbl]: #table1
pandoc -o example16.pdf example16.md
pandoc -s -o example16.html example16.md
pandoc -o example16.docx example16.md
Linking to external documents follows a similar format:
[in text label](Reference label)
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
Goto the [Google search engine](http://www.google.com)
You can also use the following format:
Goto the [DuckDuckGo].
[DuckDuckGO]: http://www.duckduckgo.com "The DuckDuckGo search engine"
pandoc -o example17.pdf example17.md
pandoc -s -o example17.html example17.md
pandoc -o example17.docx example17.md
In addition to the above, pandoc filters provide more sophisticated cross referencing. This requires the pandoc-crossref and pandoc-citeproc filters and the --number-sections option.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
---
# Introduction {#sec:intro}
{#fig:fig1 width=30%}
A reference to [table @tbl:tab1], [figure @fig:fig1] and [equation @eq:equation1],
([see section @sec:intro]) or even [tables @tbl:tab1] [--@tbl:tab2].
$$ y \sim \beta_0 + \beta_1 X $$ {#eq:equation1}
Table: A description of the table {#tbl:tab1}
+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+===============+===============+====================+
| Bananas       | $1.34         | - built-in wrapper |
|               |               | - bright color     |
+---------------+---------------+--------------------+
Table: Another table {#tbl:tab2}
+-----------+----------+-----------+
|Column A   |Column B  |   Column C|
+===========+==========+===========+
|Category 1 |100.00    | - point A |
+-----------+----------+-----------+
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -o example18.pdf example18.md
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -s -o example18.html example18.md
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -o example18.docx example18.md
Pandoc can incorporate citations from any of the following formats: BibTeX (.bib), Copac (.copac), CSL JSON (.json), CSL YAML (.yaml), EndNote (.enl), Endnote XML (.xml), ISI (.wos), MEDLINE (.medline), MODS (.mods) and RIS (.ris).
The bibliography can be referenced either via a bibliography item in the YAML metadata or using the --bibliography argument to pandoc. This points to a file containing the bibliography.
Similarly, the citation style is determined via either the csl YAML metadata item or the --csl pandoc argument and should point to a Citation Style Language file. A large selection of CSL files can be found in the Zotero Style Repository.
Incorporating citations requires the pandoc-citeproc filter, which must be listed after the pandoc-crossref filter (if that is included). Note that from pandoc 2.11 onwards, citation processing is built in and enabled with the --citeproc option, which replaces the external pandoc-citeproc filter.
Markdown (*.md)
---
title: This is the title
author: D. Author
date: 14-02-2013
bibliography: ../resources/references.bib
csl: ../resources/marine-pollution-bulletin.csl
---
# Introduction {#sec:intro}
@Quinn-2002-2002 described something important about ecological statistics in general.
Something important about generalized mixed models [@Bolker-2008-127].
# References
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -o example19.pdf example19.md
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -s -o example19.html example19.md
pandoc -F pandoc-crossref -F pandoc-citeproc --number-sections -o example19.docx example19.md
Ideally, reproducible research works best when the documentation and source codes are woven together into a single document. Traditionally, document preparation involved substantial quantities of ‘cutting and pasting’ from statistical software into document authoring tools such as LaTeX, html or Microsoft Word. Of course, any minor changes in the analyses then necessitated replacing the code in the document as well as replacing any affected figures or tables. Keeping everything synchronised was a bit of a battle.
Early implementations of reproducible research in R involved embedding chunks of R code between special tags within either HTML or LaTeX documents. The file would then be parsed through specific R functions to evaluate each chunk and replace them with their tidied code and outputs in a process referred to as either weaving or knitting (depending on the function).
Over time the knitting routines (as supported by the knitr package) became more sophisticated. At the same time, knitr provided support for embedding R chunks into markdown. Here, markdown has begun to replace HTML and LaTeX as the base document because (as illustrated above) it is both simple to use and can act as a universal language from which other formats can be generated.
Rmarkdown is essentially a markdown file with R (or many other languages) code embedded within specially marked chunks. Code chunks are defined as starting with the sequence ```{ and ending with ```. For example, to define a simple R code chunk, we would include:
```{r name}
```
Any code that appears in the lines between the opening and closing chunk sequences will be evaluated by R. Similarly, other languages can also be used.
Importantly, to evaluate the code chunks embedded within an Rmarkdown document, the code is passed through a new R session. This means that although you might be testing the code in an R console (or Rstudio) as you write the code, it is important that the code be completely self contained. Therefore, if the code relies on a package or external function, these must be loaded as part of the script.
To see knitting in action, we will add an R code chunk to a markdown document. When we knit this document, knitr will convert the Rmarkdown file into a markdown file by evaluating any code chunks and replacing them with formatted input and output markdown fenced contents. Thereafter, we can use pandoc as we did previously to convert this markdown into a variety of output formats.
Rmarkdown (*.Rmd)
---
title: Example markdown
author: D. Author
date: 16-06-2020
---
# This is the title
```{r}
x <- rnorm(10)
summary(x)
```
echo 'library(knitr); knit("Example1.Rmd", output="Example1.md")' | R --no-save --no-restore
pandoc -o Example1.pdf Example1.md
echo 'library(knitr); knit("Example1.Rmd", output="Example1.md")' | R --no-save --no-restore
pandoc -s -o Example1.html Example1.md
echo 'library(knitr); knit("Example1.Rmd", output="Example1.md")' | R --no-save --no-restore
pandoc -s -o Example1.docx Example1.md
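The two steps can also be wrapped in a single guarded script. The sketch below is one possible arrangement rather than a prescribed workflow: the file names (knit_example.*) are hypothetical, the R and pandoc steps are skipped when Rscript, the knitr package or pandoc are unavailable, and the chunk delimiters are written with the octal escape \140 (a backtick) purely so that this listing can itself sit inside a fenced block.

```shell
# Create a minimal Rmarkdown file (hypothetical name). \140 is the octal code
# for a backtick, so each \140\140\140 below becomes a chunk delimiter.
printf '%s\n' '---' 'title: Example markdown' '---' '' '# This is the title' '' > knit_example.Rmd
printf '\140\140\140{r}\nx <- rnorm(10)\nsummary(x)\n\140\140\140\n' >> knit_example.Rmd

# Step 1: knit the Rmarkdown into plain markdown (needs R and the knitr package).
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'if (requireNamespace("knitr", quietly = TRUE)) knitr::knit("knit_example.Rmd", output = "knit_example.md")'
else
  echo "Rscript not found; skipping knitting"
fi

# Step 2: convert the knitted markdown with pandoc, if step 1 produced it.
if [ -f knit_example.md ] && command -v pandoc >/dev/null 2>&1; then
  pandoc -s -o knit_example.html knit_example.md
fi
```

Running the knitting through Rscript starts a fresh R session, which matches the requirement above that the code in the document be completely self contained.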
The above workflow is conveniently supported by an R package called rmarkdown whose main function is to act as a wrapper for knitting and running pandoc. As a very basic overview, the following would render an Rmarkdown document as a pdf file.
rmarkdown::render('file.Rmd', output_format='pdf_document')
The rmarkdown package comes with numerous output formats. These include:
| Output format | rmarkdown name |
|---|---|
| PDF | pdf_document (requires TeX) |
| HTML | html_document |
| DOCx | word_document |
| LaTeX | latex_document |
| ODT | odt_document |
| RTF | rtf_document |
| Github | github_document |
| Context | context_document |
| Markdown | md_document |
| ioslides presentation | ioslides_presentation |
| Slidy presentation | slidy_presentation |
| Powerpoint presentation | powerpoint_presentation |
| Beamer presentation | beamer_presentation (requires Tex) |
Additionally, the bookdown package contains versions of many of these formats that provide support for more advanced features (such as captions etc). These will be highlighted below where appropriate.
To illustrate the basic use of the render() function, let's process the above simple example file.
Rmarkdown (*.Rmd)
---
title: Example markdown
author: D. Author
date: 16-06-2020
---
# This is the title
```{r}
x <- rnorm(10)
summary(x)
```
library(rmarkdown)
render('Example1.Rmd', output_format='pdf_document')
library(rmarkdown)
render('Example1.Rmd', output_format='html_document')
library(rmarkdown)
render('Example1.Rmd', output_format='word_document')
The remaining sections will attempt to provide demonstrations of Rmarkdown alongside concept descriptions. Where appropriate, demonstrations will be performed in both vanilla R and RStudio. The RStudio developers have gone to considerable effort to integrate many of the reproducible research elements directly and conveniently into the interface. Where appropriate, expandable sections will be provided to give RStudio-specific descriptions and demonstrations.
Let's start by loading up an example Rmarkdown file
Describe knitting - blending in markdown and R code (knitr package)
Describe chunks
Describe rendering (rmarkdown package)
Describe some markdown
Describe Websites (render_site)
Describe Github
Describe docker